Avoiding the itemset closure computation "pitfall"
Authors
Abstract
Extracting generic bases of association rules seems to be a promising way to present users with informative, compact, added-value knowledge. However, extracting generic bases requires partially ordering itemset closures that are costly to compute. To avoid the cost of itemset closure computation, especially for sparse contexts, we introduce an algorithm, called Prince, for extracting generic bases of association rules. The main originality of Prince is that the partial order is maintained between frequent minimal generators and no longer between frequent closed itemsets. A structure called the minimal generator lattice is then built, from which the derivation of itemset closures and generic association rules becomes straightforward. An intensive experimental evaluation, carried out on benchmark and "worst case" datasets, showed that Prince largely outperforms the pioneering algorithms, i.e., Close, A-Close and Titanic.
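The notions the abstract relies on (itemset closure, frequent minimal generator, exact generic rule) can be illustrated with a small brute-force sketch. This is not the Prince algorithm: it does not build the minimal generator lattice and simply enumerates every candidate itemset. The toy context `TRANSACTIONS` and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy context: transaction id -> set of items.
TRANSACTIONS = {
    1: {"a", "c", "d"},
    2: {"b", "c", "e"},
    3: {"a", "b", "c", "e"},
    4: {"b", "e"},
    5: {"a", "b", "c", "e"},
}


def cover(itemset, transactions):
    """Ids of the transactions that contain every item of `itemset`."""
    return {tid for tid, items in transactions.items() if itemset <= items}


def support(itemset, transactions):
    """Number of transactions containing `itemset`."""
    return len(cover(itemset, transactions))


def closure(itemset, transactions):
    """Closure of `itemset`: the intersection of all transactions containing it."""
    tids = cover(itemset, transactions)
    if not tids:
        # Convention assumed here: an itemset with an empty cover closes to all items.
        return frozenset().union(*transactions.values())
    return frozenset(set.intersection(*(transactions[t] for t in tids)))


def is_minimal_generator(itemset, transactions):
    """True if dropping any single item strictly increases the support.

    Because support never increases when items are added, checking the
    subsets with one item removed is enough to cover all proper subsets.
    """
    s = support(itemset, transactions)
    return all(support(itemset - {i}, transactions) > s for i in itemset)


def frequent_minimal_generators(transactions, minsup):
    """Brute-force enumeration of non-empty frequent minimal generators."""
    items = sorted(set().union(*transactions.values()))
    found = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            if (support(candidate, transactions) >= minsup
                    and is_minimal_generator(candidate, transactions)):
                found.append(candidate)
    return found


def exact_generic_rules(transactions, minsup):
    """Exact rules g -> closure(g) \\ g, one per generator with a non-empty consequent."""
    rules = []
    for g in frequent_minimal_generators(transactions, minsup):
        consequent = closure(g, transactions) - g
        if consequent:
            rules.append((g, consequent))
    return rules


if __name__ == "__main__":
    for premise, conclusion in exact_generic_rules(TRANSACTIONS, minsup=2):
        print(sorted(premise), "->", sorted(conclusion))
```

On this toy context with `minsup=2`, the itemset `{a}` has closure `{a, c}`, so the sketch yields the exact rule `a -> c`. The enumeration is exponential in the number of items, which is the kind of cost a lattice of minimal generators is meant to avoid.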
Similar resources
E-fwarm: Enhanced Fuzzy-based Weighted Association Rule Mining Algorithm
In the Association Rule Mining (ARM) approach, equal weight is assigned to all itemsets in the dataset, so it is not appropriate for all datasets. The weight should instead be assigned based on the significance of each itemset. Weighted ARM (WARM) reduces extra steps during the generation of rules. Because WARM uses the significance of each itemset, it is applied in data mining. The Fuzzy-b...
The Hows, Whys, and Whens of Constraints in Itemset and Rule Discovery
Many researchers in our community (this author included) regularly emphasize the role constraints play in improving performance of data-mining algorithms. This emphasis has led to remarkable progress: current algorithms allow an incredibly rich and varied set of hidden patterns to be efficiently elicited from massive datasets, even under the burden of NP-hard problem definitions and disk-reside...
LCM ver.3: Collaboration of Array, Bitmap and Prefix Tree for Frequent Itemset Mining
For a transaction database, a frequent itemset is an itemset included in at least a specified number of transactions. To find all the frequent itemsets, the heaviest task is computing the frequency of each candidate itemset. Previous studies use roughly three data structures and algorithms for this computation: bitmaps, prefix trees, and array lists. Each of these has i...
A Fast Algorithm for Mining Share-Frequent Itemsets
Itemset share has been proposed as a measure of the importance of itemsets for mining association rules. The value of the itemset share can provide useful information, such as the total profit or the total quantity purchased by customers associated with an itemset in a database. The discovery of share-frequent itemsets does not have the downward closure property. Existing algorithms for discovering share-freq...
Privacy-Preserving Frequent Itemset Mining for Sparse and Dense Data
Frequent itemset mining is a task that can in turn be used for other purposes such as associative rule mining. One problem is that the data may be sensitive, and its owner may refuse to give it for analysis in plaintext. There exist many privacy-preserving solutions for frequent itemset mining, but in any case enhancing the privacy inevitably spoils the efficiency. Leaking some less sensitive i...